Search results for "Computer Science - Learning"

showing 10 items of 18 documents

Optimal rates of convergence for persistence diagrams in Topological Data Analysis

2013

Computational topology has recently seen important developments toward data analysis, giving birth to the field of topological data analysis. Topological persistence, or persistent homology, appears as a fundamental tool in this field. In this paper, we study topological persistence in general metric spaces, with a statistical approach. We show that the use of persistent homology can be naturally considered in general statistical frameworks and that persistence diagrams can be used as statistics with interesting convergence properties. Some numerical experiments are performed in various contexts to illustrate our results.

Computational Geometry (cs.CG); FOS: Computer and information sciences; [MATH.MATH-GT] Mathematics [math]/Geometric Topology [math.GT]; [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH]; Topological Data analysis; Persistent homology; minimax convergence rates; geometric complexes; metric spaces; Geometric Topology (math.GT); Mathematics - Statistics Theory; Statistics Theory (math.ST); [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-CG] Computer Science [cs]/Computational Geometry [cs.CG]; Machine Learning (cs.LG); Computer Science - Learning; Mathematics - Geometric Topology; FOS: Mathematics; Computer Science - Computational Geometry

Remote Sensing Image Classification with Large Scale Gaussian Processes

2017

Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources. Upcoming missions will soon provide large data streams that will make land cover/use classification difficult. Machine learning classifiers can help with this, and many methods are currently available. A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as results that are very competitive with state-of-the-art neural networks and support vector machines. However, its computational cost is prohibitive for…

FOS: Computer and information sciences; 010504 meteorology & atmospheric sciences; Computer science; Multispectral image; 0211 other engineering and technologies; Machine Learning (stat.ML); 02 engineering and technology; Land cover; 01 natural sciences; Statistics - Applications; Machine Learning (cs.LG); Kernel (linear algebra); Bayes' theorem; symbols.namesake; Statistics - Machine Learning; Applications (stat.AP); Electrical and Electronic Engineering; Gaussian process; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Remote sensing; Contextual image classification; Artificial neural network; Data stream mining; Probabilistic logic; Support vector machine; Computer Science - Learning; Kernel (image processing); symbols; General Earth and Planetary Sciences

A probabilistic estimation and prediction technique for dynamic continuous social science models: The evolution of the attitude of the Basque Country…

2015

In this paper, a computational technique to deal with uncertainty in dynamic continuous models in Social Sciences is presented. Considering data from surveys, the method consists of determining the probability distribution of the survey output, which allows us to sample data and fit the model to the sampled data using a goodness-of-fit criterion based on the χ²-test. Taking the fitted parameters that were not rejected by the χ²-test, substituting them into the model and computing their outputs, 95% confidence intervals at each time instant, capturing the uncertainty of the survey data (probabilistic estimation), are built. Using the same set of obtained model parameters, a prediction over …

FOS: Computer and information sciences; Attitude dynamics; Probabilistic prediction; Computer science; Population; Divergence-from-randomness model; Sample (statistics); computer.software_genre; Machine Learning (cs.LG); Probabilistic estimation; Social science; education; Probabilistic relevance model; education.field_of_study; Applied Mathematics; Probabilistic logic; Confidence interval; Computer Science - Learning; Computational Mathematics; Social dynamic models; Probability distribution; Survey data collection; Data mining; MATEMATICA APLICADA; computer; Applied Mathematics and Computation

Optimized Kernel Entropy Components

2016

This work addresses two main issues of the standard Kernel Entropy Component Analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of by variance as in Kernel Principal Components Analysis. In this work, we propose an extension of the KECA method, named Optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular…

FOS: Computer and information sciences; Computer Networks and Communications; Kernel density estimation; Machine Learning (stat.ML); 02 engineering and technology; Kernel principal component analysis; Machine Learning (cs.LG); Artificial Intelligence; Polynomial kernel; Statistics - Machine Learning; 0202 electrical engineering electronic engineering information engineering; Mathematics; business.industry; 020206 networking & telecommunications; Pattern recognition; Computer Science Applications; Computer Science - Learning; Kernel method; Kernel embedding of distributions; Variable kernel density estimation; Radial basis function kernel; Kernel smoother; 020201 artificial intelligence & image processing; Artificial intelligence; business; Software; IEEE Transactions on Neural Networks and Learning Systems

Simplifying Probabilistic Expressions in Causal Inference

2018

Obtaining a non-parametric expression for an interventional distribution is one of the most fundamental tasks in causal inference. Such an expression can be obtained for an identifiable causal effect by an algorithm or by manual application of do-calculus. Often we are left with a complicated expression which can lead to biased or inefficient estimates when missing data or measurement errors are involved. We present an automatic simplification algorithm that seeks to eliminate symbolically unnecessary variables from these expressions by taking advantage of the structure of the underlying graphical model. Our method is applicable to all causal effect formulas and is readily available in the …

FOS: Computer and information sciences; Computer Science - Artificial Intelligence; graph theory; yksinkertaisuus; simplification; graphical model; Machine Learning (stat.ML); Machine Learning (cs.LG); Computer Science - Learning; probabilistic expression; Artificial Intelligence (cs.AI); Statistics - Machine Learning; kausaliteetti; piirrosmerkit; causal inference; graafit

Anomaly Detection Framework Using Rule Extraction for Efficient Intrusion Detection

2014

Huge datasets in cyber security, such as network traffic logs, can be analyzed using machine learning and data mining methods. However, the amount of collected data is increasing, which makes analysis more difficult. Many machine learning methods have not been designed for big datasets, and consequently are slow and difficult to understand. We address the issue of efficient network traffic classification by creating an intrusion detection framework that applies dimensionality reduction and conjunctive rule extraction. The system can perform unsupervised anomaly detection and use this information to create conjunctive rules that classify huge amounts of traffic in real time. We test the impl…

FOS: Computer and information sciences; Computer Science - Learning; Computer Science - Cryptography and Security; Cryptography and Security (cs.CR); Machine Learning (cs.LG)

Ensembles of Randomized Time Series Shapelets Provide Improved Accuracy while Reducing Computational Costs

2017

Shapelets are discriminative time series subsequences that allow generation of interpretable classification models, which provide faster and generally better classification than the nearest neighbor approach. However, the shapelet discovery process requires the evaluation of all possible subsequences of all time series in the training set, making it extremely computation intensive. Consequently, shapelet discovery for large time series datasets quickly becomes intractable. A number of improvements have been proposed to reduce the training time. These techniques use approximation or discretization and often lead to reduced classification accuracy compared to the exact method. We are proposin…
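The cost described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it only shows the commonly used shapelet distance (the smallest Euclidean distance between a candidate and any equal-length window of a series) and counts how many candidates an exhaustive search would have to evaluate. The function names and the toy data are hypothetical, and z-normalisation, which many shapelet implementations apply to each window, is omitted for brevity.

```python
import math


def subsequences(series, length):
    """Yield every contiguous window of `length` points from `series`."""
    for start in range(len(series) - length + 1):
        yield series[start:start + length]


def shapelet_distance(shapelet, series):
    """Smallest Euclidean distance between `shapelet` and any window of `series`.

    Assumes len(shapelet) <= len(series); otherwise there is no window
    to compare against and min() raises ValueError.
    """
    return min(
        math.dist(shapelet, window)
        for window in subsequences(series, len(shapelet))
    )


def count_candidates(dataset, min_len, max_len):
    """Number of candidate subsequences an exhaustive search would evaluate."""
    return sum(
        max(len(series) - length + 1, 0)
        for series in dataset
        for length in range(min_len, max_len + 1)
    )


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    series = [0, 0, 1, 2, 1, 0]
    print(shapelet_distance([1, 2, 1], series))
    print(count_candidates([series, [0, 1, 0, 1]], 3, 4))
```

Because every candidate must also be compared against every training series, the total work grows quickly with both series length and dataset size, which is the bottleneck the abstract refers to.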

FOS: Computer and information sciences; Computer Science - Learning; ComputingMethodologies_PATTERNRECOGNITION; Machine Learning (cs.LG)

Renewable Energy Prediction using Weather Forecasts for Optimal Scheduling in HPC Systems

2014

The objective of the GreenPAD project is to use green energy (wind, solar and biomass) to power data centers that run HPC jobs. As part of this, it is important to predict the renewable (wind) energy for efficient scheduling (executing jobs that require more energy when more green energy is available, and vice versa). To predict the wind energy, we first analyze the historical data to find a statistical model that relates wind energy to weather attributes. Then we use this model with the weather forecast data to predict future green energy availability. Using the green energy prediction obtained from the statistical model we are able…

FOS: Computer and information sciences; Computer Science - Learning; Physics::Atmospheric and Oceanic Physics; Machine Learning (cs.LG)

Forecasting: theory and practice

2022

Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a varie…

FOS: Computer and information sciences; Computer Science - Machine Learning; Time series; Economics; Application; Other Engineering and Technologies not elsewhere specified; Econometrics (econ.EM); HA; Method; Machine Learning (stat.ML); Review; Statistics - Applications; Machine Learning (cs.LG); FOS: Economics and business; Business and Economics; Statistics - Machine Learning; Methods; Principle; REVIEW; Applications (stat.AP); Övrig annan teknik; N100; Business and International Management; Nationalekonomi; Economics - Econometrics; Business Administration; Företagsekonomi; APPLICATIONS; Other Statistics (stat.OT); Wirtschaftswissenschaften; stat.OT; Statistics - Other Statistics; Computer Science - Learning; 003: Systeme; PRINCIPLES; econ.EM; Applications; METHODS; Encyclopedia; Prediction; Principles; REVIEW ENCYCLOPEDIA METHODS APPLICATIONS PRINCIPLES TIME SERIES PREDICTION; Forecasting

Probabilistic and team PFIN-type learning: General properties

2008

We consider the probability hierarchy for Popperian FINite learning and study the general properties of this hierarchy. We prove that the probability hierarchy is decidable, i.e. there exists an algorithm that receives p_1 and p_2 and answers whether PFIN-type learning with the probability of success p_1 is equivalent to PFIN-type learning with the probability of success p_2. To prove our result, we analyze the topological structure of the probability hierarchy. We prove that it is well-ordered in descending ordering and order-equivalent to ordinal epsilon_0. This shows that the structure of the hierarchy is very complicated. Using similar methods, we also prove that, for PFIN-type learning…

FOS: Computer and information sciences; Computer Science::Machine Learning; Theoretical computer science; Computer Networks and Communications; Existential quantification; Structure (category theory); Decidability; Type (model theory); Learning in the limit; Theoretical Computer Science; Machine Learning (cs.LG); Probability of success; Finite limits; Mathematics; Ordinals; Discrete mathematics; Hierarchy; business.industry; Applied Mathematics; Algorithmic learning theory; Probabilistic logic; F.1.1 I.2.6; Inductive inference; Inductive reasoning; Computer Science - Learning; Team learning; Computational Theory and Mathematics; Artificial intelligence; business; Journal of Computer and System Sciences